Multi-task Deep Neural Networks in Automated Protein Function Prediction
Authors
Abstract
A short version of this manuscript was accepted for oral presentation at the ISMB/ECCB 2017 Function-COSI meeting.
arXiv:1705.04802v2 [q-bio.QM] 28 May 2017

Background: In recent years, deep learning algorithms have outperformed state-of-the-art methods in several areas, such as computer vision and speech recognition, thanks to efficient methods for training and for preventing overfitting, advances in computer hardware, and the availability of vast amounts of data. The high performance of multi-task deep neural networks in drug discovery has drawn attention to deep learning algorithms in bioinformatics. Protein function prediction is a crucial research area where more accurate prediction methods are still needed. Here, we propose a hierarchical multi-task deep neural network architecture based on Gene Ontology (GO) terms as a solution to the protein function prediction problem and investigate various aspects of the proposed architecture through several experiments.

Results: First, we showed that there is a positive correlation between the performance of the system and the size of the training datasets. Second, we investigated whether the level of GO terms in the GO hierarchy is related to their performance, and found no relation between the depth of a GO term in the hierarchy (i.e., general vs. specific) and its performance. In addition, we included all annotations in the training of a set of GO terms to investigate whether adding noisy data to the training datasets changes the performance of the system. The results showed that including less reliable annotations in the training of deep neural networks significantly increased the performance of the poorly performing GO terms. Finally, we evaluated the performance of the system using a hierarchical evaluation method. The Matthews correlation coefficient was calculated as 0.75, 0.49 and 0.63 for the molecular function, biological process and cellular component categories, respectively.

Conclusions: We showed that deep learning algorithms have great potential in protein function prediction. We plan to further improve DEEPred by including other types of annotations from various biological data sources. Finally, we plan to release DEEPred as an open access online tool.
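The abstract reports results as Matthews correlation coefficients (MCC). For readers unfamiliar with the metric, the following is a minimal sketch of how MCC is computed for a single binary label (for example, one GO term) from true and predicted annotations. The function name and the toy labels are hypothetical and are not taken from the paper; the paper's hierarchical evaluation procedure is not reproduced here.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1).

    Hypothetical helper for illustration only; not the authors' code.
    Returns 0.0 when the denominator is zero (a common convention).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy example: 8 proteins, 1 = annotated with the term, 0 = not annotated.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
score = matthews_corrcoef(y_true, y_pred)  # TP=3, TN=3, FP=1, FN=1
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and it stays informative when positive and negative examples are imbalanced, which is typical for GO term annotation.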
Similar resources
Multi-Step-Ahead Prediction of Stock Price Using a New Architecture of Neural Networks
Modelling and forecasting the stock market is a challenging task for economists and engineers since it has a dynamic structure and nonlinear characteristics. This nonlinearity affects the efficiency of the price characteristics. Using an Artificial Neural Network (ANN) is a proper way to model this nonlinearity and it has been used successfully in one-step-ahead and multi-step-ahead prediction of di...
TopologyNet: Topology based deep convolutional and multi-task neural networks for biomolecular property predictions
Although deep learning approaches have had tremendous success in image, video and audio processing, computer vision, and speech recognition, their applications to three-dimensional (3D) biomolecular structural data sets have been hindered by the geometric and biological complexity. To address this problem we introduce the element-specific persistent homology (ESPH) method. ESPH represents 3D co...
A multi-scale convolutional neural network for automatic cloud and cloud shadow detection from Gaofen-1 images
The reconstruction of the information contaminated by cloud and cloud shadow is an important step in pre-processing of high-resolution satellite images. The cloud and cloud shadow automatic segmentation could be the first step in the process of reconstructing the information contaminated by cloud and cloud shadow. This stage is a remarkable challenge due to the relatively inefficient performanc...
Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches
A DNA sequence, which contains all genetic traits, is not a functional entity. Instead, it is converted to protein sequences through the processes of transcription and translation. The protein sequence later takes on a 3D structure, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can think of is undertaken by proteins with specific functio...
Comparison of Artificial Neural Networks and Cox Regression Models in Prediction of Kidney Transplant Survival
The Cox regression model is a statistical method for analyzing survival data that requires assumptions such as hazard proportionality. In recent decades, artificial neural network models have been increasingly applied to predict survival data. This research was conducted to compare Cox regression and artificial neural network models in prediction of kidney transplant survival. The prese...
Publication date: 2017